Clustering with Propagated Constraints

Author

  • Eric Robert Eaton
Abstract

Title of Thesis: Clustering with Propagated Constraints
Eric Robert Eaton, Master of Science, 2005
Thesis directed by: Dr. Marie desJardins, Assistant Professor, Department of Computer Science and Electrical Engineering

Background knowledge in the form of constraints can dramatically improve the quality of generated clustering models. In constrained clustering, these constraints typically specify the relative cluster membership of pairs of points. They are tedious to specify and expensive from a user perspective, yet are very useful in large quantities. Existing constrained clustering methods perform well when given large quantities of constraints, but do not focus on performing well when given very small quantities. This thesis focuses on providing a high-quality clustering with small quantities of constraints. It proposes a method for propagating pairwise constraints to nearby instances using a Gaussian function. This method takes a few easily specified constraints, and propagates them to nearby pairs of points to constrain the local neighborhood. Clustering with these propagated constraints can yield superior performance with fewer constraints than clustering with only the original user-specified constraints. The experiments compare the performance of clustering with propagated constraints to that of established constrained clustering algorithms on several real-world data sets.
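The propagation idea in the abstract can be illustrated with a minimal sketch. This is not the thesis's actual formulation, which the abstract does not give. It assumes a pairwise constraint between points a and b is copied to another pair (i, j) with a weight that decays as a Gaussian of the squared distances from i and j to the constrained endpoints. The function names, the `sigma` and `min_weight` parameters, and the signed-weight encoding (+ for must-link, - for cannot-link) are all hypothetical choices made for this example.

```python
import math
from itertools import combinations


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def propagate_constraint(points, constraint, sigma=1.0, min_weight=0.1):
    """Spread one pairwise constraint to nearby pairs of points.

    constraint is (a, b, kind): indices a and b into points, and kind = +1
    for must-link or -1 for cannot-link (an assumed encoding).
    Returns {(i, j): signed_weight} for every pair whose Gaussian weight
    is at least min_weight. The weighting scheme is an illustrative
    assumption, not the method defined in the thesis.
    """
    a, b, kind = constraint
    propagated = {}
    for i, j in combinations(range(len(points)), 2):
        # A pair matches the constraint in either orientation:
        # (i near a, j near b) or (i near b, j near a).
        direct = sq_dist(points[i], points[a]) + sq_dist(points[j], points[b])
        swapped = sq_dist(points[i], points[b]) + sq_dist(points[j], points[a])
        weight = math.exp(-min(direct, swapped) / (2.0 * sigma ** 2))
        if weight >= min_weight:
            propagated[(i, j)] = kind * weight
    return propagated


# Two tight groups of points and a single cannot-link between points 0 and 2.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
result = propagate_constraint(points, (0, 2, -1), sigma=0.5)
for pair, weight in sorted(result.items()):
    print(pair, round(weight, 3))
```

In this toy setup the single cannot-link spreads to the cross-group pairs near the constrained points, while within-group pairs receive negligible weight and are dropped. A constrained clustering algorithm could then treat the signed weights as soft constraints; how the thesis actually uses the propagated constraints is not stated in the abstract.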

Similar resources

Scalable Active Temporal Constrained Clustering

We introduce a novel interactive framework to handle both instance-level and temporal smoothness constraints for clustering large temporal data. It consists of a constrained clustering algorithm which optimizes the clustering quality, constraint violation and the historical cost between consecutive data snapshots. At the center of our framework is a simple yet effective active learning techniqu...

Value, Cost, and Sharing: Open Issues in Constrained Clustering

Clustering is an important tool for data mining, since it can identify major patterns or trends without any supervision (labeled data). Over the past five years, semi-supervised (constrained) clustering methods have become very popular. These methods began with incorporating pairwise constraints and have developed into more general methods that can learn appropriate distance metrics. However, s...

Generating Optimal Timetabling for Lecturers using Hybrid Fuzzy and Clustering Algorithms

UCTTP is an NP-hard problem that must be solved anew every semester. The main technique in the presented approach is analyzing data to resolve uncertainties in lecturers’ preferences and constraints within a department, in order to obtain a ranking for each lecturer based on their requirements, with the aim of increasing their satisfaction and develo...

Semi-supervised clustering via multi-level random walk

A key issue of semi-supervised clustering is how to utilize the limited but informative pairwise constraints. In this paper, we propose a new graph-based constrained clustering algorithm, named SCRAWL. It is composed of two random walks with different granularities. In the lower-level random walk, SCRAWL partitions the vertices (i.e., data points) into constrained and unconstrained ones, accord...

Multiresolution genetic clustering algorithm for texture segmentation

This work plans to approach the texture segmentation problem by incorporating genetic algorithm and K-means clustering method within a multiresolution structure. As the algorithm descends the multiresolution structure, the coarse segmentation results are propagated down to the lower levels so as to reduce the inherent class–position uncertainty and to improve the segmentation accuracy. The proc...

Publication date: 2005